# Guided Analysis - User Security Metadata (Public Preview)

**Notebook Version:** 1.0  
**Python Version:** Python 3.6  
**Required Packages**: kqlmagic, validate_email, jsonpickle, azure-cli-core, Azure-Sentinel-Utilities  
  
**Platforms Supported**:
- Azure Notebooks Free Compute
- Azure Notebooks DSVM
- OS Independent
  
**Data Sources Required**:
- Log Analytics : UserPeerAnalytics, UserAccessAnalytics

**Permissions Required**:
- **Log Analytics Read Permissions**: To connect and query the workspace you need to be assigned at least [Reader](https://docs.microsoft.com/azure/role-based-access-control/built-in-roles#reader) or [Microsoft Sentinel Reader](https://docs.microsoft.com/azure/role-based-access-control/built-in-roles#azure-sentinel-reader) role on the workspace.
- **Directory Basic Read Permissions**: If you are a native member of the tenant, then by [default](https://docs.microsoft.com/azure/active-directory/fundamentals/users-default-permissions#compare-member-and-guest-default-permissions) you have permission to read user, group, and service principal information. If you are a guest user in the tenant, you need to be assigned the [Directory Reader](https://docs.microsoft.com/azure/active-directory/users-groups-roles/directory-assign-admin-roles#directory-readers) role.

**Description**:  
This notebook introduces the contextual security metadata gathered for Azure Active Directory (AAD) users. The following security metadata is available today:
- **UserAccessAnalytics**: The most important step in investigating a security incident is identifying the blast radius of the user under investigation. For a given user, this enrichment data calculates the direct or transitive access/permissions to resources. In Public Preview, the blast radius access graph is limited to RBAC access to subscriptions. For example, if the user under investigation is Jane Smith, the Access Graph displays all the Azure subscriptions that she can access directly or via groups or service principals.
- **UserPeerAnalytics**: Analysts frequently use the peers of a user under investigation to scope the security incident. For a given user, this enrichment data provides a ranked list of peers. For example, if the user is Jane Smith, Peer Enrichment calculates all of Jane's peers based on her mailing lists, security groups, etc., and provides her top 20 peers. Specifically, this information is calculated by Natural Language Processing algorithms from group membership information in Azure Active Directory.

This is a Microsoft Sentinel **Public Preview** feature. If you are interested in the above analytics data please contact ramk at microsoft com.

## Contents:  
- [Setup](#setup)
    - [Install Packages](#install)
    - [Enter Tenant and Workspace Ids](#tenant-and-worskpace-ids)
    - [Connect to Log Analytics](#connect-to-la)
    - [Log into Azure CLI](#log-into-azure)
    - [Enter User Information](#user-input)
- [Access Graph of the user](#access-graph)
- [Ranked peers of the user](#user-peers)

<a id='setup'></a>
# Setup
<a id='install'></a>
## Install Packages
The first time this cell runs in a new Azure Notebooks project or local Python environment, it takes several minutes to download and install the packages. On subsequent runs it should finish quickly and confirm that the package dependencies are already installed. Unless you want to upgrade the packages, you can skip the next cell.

In [None]:
print('Please wait. Installing required packages. This may take a few minutes...')
!pip install Kqlmagic --no-cache-dir --upgrade
!pip install validate_email --upgrade
!pip install jsonpickle --upgrade
!pip install azure-cli-core --upgrade
!pip install --upgrade Azure-Sentinel-Utilities
print('Required Package Installation Complete')

<a id='tenant-and-worskpace-ids'></a>
## Enter Tenant and Workspace Ids
You can configure your TenantId and WorkspaceId in a config.json file next to the notebook; see the sample [here](https://github.com/Azure/Azure-Sentinel/blob/master/Notebooks/config.json). If the config.json file is missing, you will be prompted to enter the TenantId and WorkspaceId manually.  
To find your WorkspaceId go to [Log Analytics](https://portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.OperationalInsights%2Fworkspaces), and look at the workspace properties to find the ID.

In [None]:
import os.path
import SentinelUtils

tenantId = None
workspaceId = None
configFile = "config.json"

if os.path.isfile(configFile):
    try:
        print(f"Reading workspace configuration from local '{configFile}' file... ", end = "")
        # Parse the file once; index 0 holds the tenant ID and index 3 the workspace ID.
        configValues = SentinelUtils.config_reader.ConfigReader.read_config_values(configFile)
        tenantId = configValues[0]
        workspaceId = configValues[3]
        print("Done!")
        print(f"Tenant - '{tenantId}' and Log Analytics Workspace - '{workspaceId}' retrieved from {configFile}")
    except Exception as ex:
        # Report the problem, then fall through to the manual prompt below.
        print(f"Failed ({ex}).")

if not workspaceId or not tenantId:
    print(f"Unable to retrieve tenantId and workspaceId from '{configFile}'.")
    print('Enter Azure TenantId: ')
    tenantId = input().strip()
    print()
    print('Enter Sentinel Workspace Id: ')
    workspaceId = input().strip()
    print()

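The lookup that the cell above delegates to `SentinelUtils` can be sketched with the standard library alone. This is a minimal illustration, not the `SentinelUtils` implementation: the function name `read_workspace_config` is hypothetical, and the key names `tenant_id` and `workspace_id` are assumed to match the linked sample file.

```python
import json


def read_workspace_config(path):
    """Return (tenant_id, workspace_id) from a JSON file, or (None, None) on failure.

    Hypothetical helper: the key names below are an assumption about the
    layout of config.json, not something this notebook guarantees.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            config = json.load(handle)
    except (OSError, json.JSONDecodeError):
        # Missing or malformed file: let the caller prompt for the values.
        return None, None

    if not isinstance(config, dict):
        return None, None

    return config.get("tenant_id"), config.get("workspace_id")
```

If either returned value is `None`, the caller can fall back to prompting for it, as the cell above does.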

<a id='connect-to-la'></a>
## Connect to Log Analytics
This connection is required to read the tables in your Log Analytics workspace.

In [None]:
%reload_ext Kqlmagic
%kql loganalytics://code().tenant(tenantId).workspace(workspaceId)

<a id='log-into-azure'></a>
## Log into Azure CLI
Azure CLI is used to retrieve the display names and email addresses of users, groups, and service principals from AAD.

In [None]:
!az login --tenant $tenantId
%run Entities.py
%run GraphVis.py

<a id='user-input'></a>
## Enter User Information

In [None]:
from Utils import validatedate
from datetime import date
import ipywidgets as widgets
from IPython.display import display

print('Enter object Id or UPN or email address of the user: ')
userIdOrEmail = input().strip()
print()

if not userIdOrEmail :
    raise Exception("Error: Empty Object Id or UPN or email address.")

print(f'Retrieving user "{userIdOrEmail}" from the tenant...', end = '')
user = User.getUserByIdOrEmail(userIdOrEmail)
print("Done!")
print("Name - {0}, Email - {1}, Id - {2}".format(user.name, user.email, user.objectId))
print()

print('[Optional] Enter date in format yyyy-MM-dd to retrieve analytics from that date. If you want latest, leave it empty and press enter: ')
time  = input().strip()

if not time :
    today = date.today()
    time = today.strftime("%Y-%m-%d")
else:
    validatedate(time)

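The cell above relies on `validatedate` from the notebook's `Utils` module, whose source is not shown here. A check of that kind could look roughly like the following sketch; `validate_date` is a hypothetical stand-in and may differ from the real helper in behavior and error messages.

```python
from datetime import datetime


def validate_date(text):
    """Parse text as a yyyy-MM-dd date, raising ValueError if it does not match.

    Hypothetical stand-in for Utils.validatedate; returns the parsed date.
    """
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        raise ValueError(
            f"Incorrect date '{text}', expected the format yyyy-MM-dd"
        ) from None
    return parsed.date()
```

Rejecting a malformed date before it is interpolated into the KQL query avoids passing an unparseable value to `todatetime`.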
<a id='access-graph'></a>
# Access Graph of the user:
Run this cell to visualize the access/permissions of the user in a graph. The cell queries the 'UserAccessAnalytics' table to retrieve direct/transitive RBAC access of the user to subscriptions. 

In [None]:
from IPython.display import clear_output, display, HTML

kql_query = f"""
let userId = "{user.objectId}";
let blastRadTime = todatetime('{time}');

let userSubAccess = UserAccessAnalytics
| where SourceEntityId == userId and TargetEntityType == "AzureSubscription" and TimeGenerated <= blastRadTime
| project UserId = SourceEntityId, TimeGenerated , SubscriptionName = TargetEntityName, Subscription = TargetEntityId, Role = AccessLevel, GroupId = "", ServicePrincipalId = ""
| summarize arg_max(TimeGenerated, *) by Subscription, Role;

let userGroupAccess = UserAccessAnalytics
| where SourceEntityId == userId and TargetEntityType == "Group" and TimeGenerated <= blastRadTime
| project UserId = SourceEntityId, GroupId = TargetEntityId, TimeGenerated
| summarize arg_max(TimeGenerated, *) by GroupId;

let userGroupSubAccess = userGroupAccess
| join kind = inner
UserAccessAnalytics
on $left.GroupId == $right.SourceEntityId
| where TargetEntityType == "AzureSubscription" and TimeGenerated <= blastRadTime
| project UserId, GroupId, ServicePrincipalId = "", TimeGenerated, SubscriptionName = TargetEntityName, Subscription = TargetEntityId, Role = AccessLevel
| summarize arg_max(TimeGenerated, *) by GroupId, Subscription, Role;

let userSPAccess = UserAccessAnalytics
| where SourceEntityId == userId and TargetEntityType == "ServicePrincipal" and TimeGenerated <= blastRadTime
| project UserId = SourceEntityId, ServicePrincipalId = TargetEntityId, TimeGenerated
| summarize arg_max(TimeGenerated, *) by ServicePrincipalId;

let userSPSubAccess = userSPAccess
| join kind = inner
UserAccessAnalytics
on $left.ServicePrincipalId == $right.SourceEntityId
| where TargetEntityType == "AzureSubscription" and TimeGenerated <= blastRadTime
| project UserId, GroupId = "", ServicePrincipalId, TimeGenerated, SubscriptionName = TargetEntityName, Subscription = TargetEntityId, Role = AccessLevel
| summarize arg_max(TimeGenerated, *) by ServicePrincipalId, Subscription, Role;

userGroupSubAccess
| union kind=outer userSubAccess
| union kind=outer userSPSubAccess"""

print(f"Executing Kql query to retrieve access analytics for user '{user.name}', on or before '{time}'.. ", end = '')
%kql -query kql_query
print('Done!')

usersubMappings = _kql_raw_result_.to_dataframe()

if len(usersubMappings) == 0:
    print(f"No access analytics data available for user '{user.name}', on or before '{time}'")
else:
    print('Creating Graph visualization. This may take a few seconds.. ', end = '')
    graph = GraphVis()

    for index, row in usersubMappings.iterrows():
        sub = Subscription(row['SubscriptionName'], row['Subscription'])
        rbacRole = row['Role']

        if row['GroupId'] == '' and row['ServicePrincipalId'] == '':
            graph.addEdge(user.getNode(), sub.getNode(), rbacRole)
        elif row['GroupId']:
            group = Group.getGroupById(row['GroupId'])
            graph.addEdge(user.getNode(), group.getNode(), "Member")
            graph.addEdge(group.getNode(), sub.getNode(), rbacRole)
        elif row['ServicePrincipalId']:
            sp = ServicePrincipal.getServicePrincipalById(row['ServicePrincipalId'])
            graph.addEdge(user.getNode(), sp.getNode(), "Owner")
            graph.addEdge(sp.getNode(), sub.getNode(), rbacRole)

    print('Done!')
    display(HTML(graph.getHtml()))

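The branching in the cell above can be illustrated without Azure or the notebook's `GraphVis` class. The sketch below mirrors the same rules using plain tuples; `edges_for_row` is a hypothetical helper, and the row is assumed to be a dictionary with the column names produced by the query.

```python
def edges_for_row(user_id, row):
    """Return the (source, target, label) edges implied by one result row.

    Hypothetical helper mirroring the cell's logic: direct access yields one
    edge, while access via a group or service principal yields two.
    """
    subscription = row["Subscription"]
    role = row["Role"]

    if row["GroupId"]:
        group_id = row["GroupId"]
        return [(user_id, group_id, "Member"), (group_id, subscription, role)]

    if row["ServicePrincipalId"]:
        sp_id = row["ServicePrincipalId"]
        return [(user_id, sp_id, "Owner"), (sp_id, subscription, role)]

    # Both intermediate IDs are empty, so the user holds the role directly.
    return [(user_id, subscription, role)]
```

Each transitive path contributes a membership or ownership edge plus the RBAC edge, which is why one subscription can appear several times in the rendered graph.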

<a id='user-peers'></a>
# Ranked peers of the user
This cell queries the 'UserPeerAnalytics' table to return a ranked list of peers of the user.

In [None]:
from IPython.display import clear_output, display, HTML
import tabulate

kql_query = f"""
let userId = "{user.objectId}";
let snapshotTime = todatetime('{time}');

UserPeerAnalytics
| where UserId == userId 
| join kind = inner
(
UserPeerAnalytics
| where TimeGenerated <= snapshotTime and UserId == userId 
| summarize max(TimeGenerated)
| project TimeGenerated = max_TimeGenerated
)
on TimeGenerated
| project PeerUserId, Rank
| order by Rank asc"""

print(f"Executing Kql query to retrieve peer analytics for user '{user.name}', on or before '{time}'.. ", end = '')
%kql -query kql_query
print('Done!')

peerListDF = _kql_raw_result_.to_dataframe()

peerList = []
peerList.append(["UserName", "PeerUserName", "PeerEmail", "Rank"])

if len(peerListDF) == 0:
    print(f"No peer analytics data available for user '{user.name}', on or before '{time}'")
else:
    print('Retrieving user names and email addresses for peers. This may take a few seconds...', end = '')

    for index, row in peerListDF.iterrows():
        peerUserId = row['PeerUserId']    
        peerRank = row['Rank'] 
        peerUser = User.getUserByIdOrEmail(peerUserId)
        peerList.append([user.name, peerUser.name, peerUser.email, peerRank])

    print('Done!')
    display(HTML(tabulate.tabulate(peerList, tablefmt='html')))
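
The query above keeps only the most recent snapshot on or before the requested date and orders it by rank. The same selection is sketched below in plain Python, assuming the rows are dictionaries for a single user with `TimeGenerated` held as `datetime` values; `latest_ranked_peers` is a hypothetical name used only for illustration.

```python
from datetime import datetime


def latest_ranked_peers(rows, snapshot_time):
    """Return (PeerUserId, Rank) pairs from the newest snapshot <= snapshot_time.

    Hypothetical helper: rows are assumed to belong to one user already.
    """
    eligible = [row for row in rows if row["TimeGenerated"] <= snapshot_time]
    if not eligible:
        return []

    # Equivalent of `summarize max(TimeGenerated)` followed by the inner join.
    latest = max(row["TimeGenerated"] for row in eligible)
    snapshot = [row for row in eligible if row["TimeGenerated"] == latest]

    # Equivalent of `order by Rank asc`.
    snapshot.sort(key=lambda row: row["Rank"])
    return [(row["PeerUserId"], row["Rank"]) for row in snapshot]


if __name__ == "__main__":
    sample = [
        {"PeerUserId": "peer-a", "Rank": 2, "TimeGenerated": datetime(2020, 1, 1)},
        {"PeerUserId": "peer-b", "Rank": 1, "TimeGenerated": datetime(2020, 1, 1)},
    ]
    print(latest_ranked_peers(sample, datetime(2020, 1, 3)))
```

Older snapshots are ignored entirely, so a peer that appears only in an earlier snapshot is not listed.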